Text Mining from Free Unstructured Text: An Experiment of Time Series Retrieval for Volcano Monitoring


Abstract

Volcanic activity may influence climate parameters and impact people's safety; hence, monitoring its characteristic indicators and their temporal evolution is crucial. Several databases, communications, and literature providing data, information, and updates on active volcanoes worldwide are available, and will likely increase in the future. Consequently, information extraction and text mining techniques that aim to efficiently analyze such databases and gather data of interest on a specific volcano can play an important role in this applied science field. This work presents a natural language processing (NLP) system that we developed to extract geochemical and geophysical data from free unstructured text included in reports and operational bulletins issued by volcanological observatories in HTML, PDF, and MS Word formats. The NLP system enables the extraction of relevant gas parameters (e.g., SO2 and CO2 flux) from the text, and was tested on a series of 2839 daily and weekly bulletins published online between 2015 and 2021 for the Stromboli volcano (Italy). The experiment shows that the system proves capable of extracting the time series of a set of user-defined parameters that can be later analyzed and interpreted by specialists in relation with other geospatial data. The system can potentially be tuned to extract other target parameters from other databases.
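The idea of turning dated bulletins into a parameter time series can be illustrated with a minimal sketch. The abstract does not describe how the NLP system is implemented, so the following is not the authors' method: it swaps in a plain regular-expression match as a stand-in. The bulletin texts, the names `BULLETINS`, `build_pattern`, and `extract_series`, and the assumed phrasing ("<gas> flux ... <number>") are all hypothetical and invented for illustration.

```python
import re
from datetime import date

# Hypothetical bulletin snippets, invented for illustration only;
# they are not taken from any real observatory report.
BULLETINS = {
    date(2020, 1, 6): "The SO2 flux was at a medium level (about 150 t/d).",
    date(2020, 1, 13): "CO2 flux from the soil: 9,500 g/m2/d. SO2 flux of 210.5 t/d.",
    date(2020, 1, 20): "No gas measurements were available this week.",
}


def build_pattern(parameter: str) -> re.Pattern:
    """Assumed phrasing: '<parameter> flux', up to 40 non-digit characters, then a number."""
    return re.compile(
        rf"{re.escape(parameter)}\s+flux\D{{0,40}}?(\d[\d,]*(?:\.\d+)?)",
        re.IGNORECASE,
    )


def extract_series(bulletins: dict, parameter: str) -> list:
    """Return (date, value) pairs, sorted by date, for bulletins that mention the parameter."""
    pattern = build_pattern(parameter)
    series = []
    for day in sorted(bulletins):
        match = pattern.search(bulletins[day])
        if match:
            # Assumption: commas are thousands separators, not decimal commas.
            value = float(match.group(1).replace(",", ""))
            series.append((day, value))
    return series


so2_series = extract_series(BULLETINS, "SO2")
co2_series = extract_series(BULLETINS, "CO2")
```

Bulletins that do not mention the requested parameter are simply skipped, so the resulting series may have gaps. A real system would also need to handle units, value ranges, and qualitative statements (for example "medium level"), which this sketch ignores.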


Similar resources

Mining criminal networks from unstructured text documents

Digital data collected for forensics analysis often contain valuable information about the suspects’ social networks. However, most collected records are in the form of unstructured textual data, such as e-mails, chat messages, and text documents. An investigator often has to manually extract the useful information from the text and then enter the important pieces into a structured database for...


Entity Retrieval and Text Mining for Online Reputation Monitoring

Online Reputation Monitoring (ORM) is concerned with the use of computational tools to measure the reputation of entities online, such as politicians or companies. In practice, current ORM methods are constrained to the generation of data analytics reports, which aggregate statistics of popularity and sentiment on social media. We argue that this format is too restrictive as end users often lik...


Text Mining for Technology Monitoring

A considerable part of scientific and technological knowledge is coded in writing. In this context, automated text categorization can be regarded as a promising tool particularly for patent data analysis. In a real-life example, we show that automated text categorization can closely resemble the time-consuming categorisation job of an expert. By comparing different algorithms we reveal systema...


Mining Free Text for Structure

INTRODUCTION When the manager of a mutual fund sits down to write an update of the fund's prospectus, he does not start his job from scratch. He knows what the fund's shareholders expect to see in the document and arranges the information accordingly. An inventor, ready to register his idea with the Patent and Trademark Office of the U.S. Department of Commerce, writes it up in accordance with ...


Design and Test of the Real-time Text mining dashboard for Twitter

One of today's major research trends in the field of information systems is the discovery of implicit knowledge hidden in datasets that are currently being produced at high speed, in large volumes, and in a wide variety of formats. Data with such features is called big data. Extracting, processing, and visualizing this huge amount of data has become one of the concerns of data science scholar...



Journal

Journal title: Applied Sciences

Year: 2022

ISSN: 2076-3417

DOI: https://doi.org/10.3390/app12073503